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Abstract — We study the problem of sampling a random signal 
with sparse support in frequency domain. Shannon famously 
considered a scheme that instantaneously samples the signal at 
equispaced times. He proved that the signal can be reconstructed 
as long as the sampling rate exceeds twice the bandwidth (Nyquist 
rate). Candes, Romberg, Tao introduced a scheme that acquires 
instantaneous samples of the signal at random times. They proved 
that the signal can be uniquely and efficiently reconstructed, 
provided the sampling rate exceeds the frequency support of the 
signal, times logarithmic factors. 

In this paper we consider a probabilistic model for the signal, 
and a sampling scheme inspired by the idea of spatial coupling in 
coding theory. Namely, we propose to acquire non-instantaneous 
samples at random times. Mathematically, this is implemented by 
acquiring a small random subset of Gabor coefficients. We show 
empirically that this scheme achieves correct reconstruction as 
soon as the sampling rate exceeds the frequency support of the 
signal, thus reaching the information theoretic limit. 

I. Introduction 

Sampling a continuous signal is a key component of digital 
communication systems [12]. Despite the foundations of sam- 
pling theory were laid down by Shannon [ 15 1, its mathematical 
formulation is not probabilistic, and bears little resemblance 
to the one of information theory. In this paper we consider an 
information-theoretic formulation of sampling. 

A. Definitions 

For the sake of simplicity, we consider a discrete-time model 
(analogous to the one of (3)) and denote signals in time 
domain as a; € C", x — (x(i))i<t< n = (x(l), . . . ,x(ri)). 
Their discrete Fourier transform is denoted by x G C™, 
x = (x(uj)) uje Q n , where Q n = {u = 2-Kk/n : k G {0, 1, . . . , 
n — 1}}. The Fourier transform x = (Fx) is given by 

n i 

x{u) = {b u ,x)=Y j b u {t)x{t), ft^Js—e*". (1) 

Here ( • , • ) denotes the standard scalar product on C". Also, 
for a complex variable z, z is the complex conjugate of z. 
Notice that (6 w ) we si n is an orthonormal basis of C". This 
implies Parseval's identity (xi,X2) = (2:1, £2)- In addition, 
the inverse transform is given by 

x(t) = V x(u) b u (t) = -^J2 *M e ^ ■ < 2 ) 
L — ' Jn * — ' 

We will denote by T n — {1, . . . , n} the time domain, and will 
consider signals that are sparse in the Fourier domain. 

A sampling mechanism is defined by a measurement matrix 
A G M mx ". Measurements y = (y(l), . . .,y(m)) G R m are 



given by 

y = Ax + w = y + w, (3) 

where id is a noise vector with variance a 2 , and yo is the 
vector of ideal (noiseless) measurements. In other words, 
y(i) = (a,i,x) where we let a*,...a^ be the rows of A. 
Instantaneous sampling corresponds to vectors a,; that are 
canonical base vectors. 

Measurements can also be given in terms of the Fourier 
transform of the signal: 

y = Afx + w, A f = AF* . (4) 

The rows of Af are denoted by a*,. . . ,a^, and obviously 
Sj = Feti. Here and below, for a matrix M, M* is the 
hermitian adjoint of M, i.e. M*j = Mji . 

B. Information theory model 

In [3 1, Candes, Romberg, Tao studied a randomized scheme 
that samples the signal instantaneously at uniformly random 
times. Mathematically, this corresponds to choosing the mea- 
surement vectors <Zj to be a random subset of the canonical 
basis in C™. They proved that, with high probability, these 
measurements allow to reconstruct x uniquely and efficiently, 
provided m > C\S\ logn, where S = {ui G ft : x(oj) 7^ 0} is 
the frequency support of the signal. 

In this paper, we consider a probabilistic model for the 
signal x, namely we assume that the components x{lS), ugfi 
are i.i.d. with F{x(uj) ^ 0} < e and E{|x(w)| 2 } < C < 00. 
The distribution of x(uj) is assumed to be known. There are 
several reasons for considering such a model. First of all it 
is relevant for applications to digital communications, where 
x corresponds to a coded message. If coding is optimal, in 
information theoretic sense, it can be thought as uniformly ran- 
dom from the codebook. For many purposes this distribution 
is equivalent to our i.i.d. model. Indeed, in the same context of 
digital communications, it is clear that information theoretic 
thinking led to impressive development in error correction, as 
demonstrated by the development of modern iterative codes 
fl3l . More broadly probabilistic models can lead to better 
understanding of limit and assumptions in practical sampling 
methods. 

C. Related work 

The sampling scheme developed here is inspired by the idea 
of spatial coupling, that recently proved successful in coding 
theory J6), lfl6l . |9l , ifTOl and was introduced to compressed 
sensing by Kudekar and Pfister [8|. The basic idea, in this 



context, is to use suitable band diagonal sensing matrices. 
Krzakala et al. [7| showed that, using the appropriate mes- 
sage passing reconstruction algorithm, and 'spatially-coupled' 
sensing matrices, a random fc-sparse signal x G K™ can be 
recovered from k+o(n) measurements. This is a surprising re- 
sult, given that standard compressed sensing methods achieve 
successful recovery from 0(fclog(n/fc)) measurements. 

The results of [7| were based on statistical mechanics meth- 
ods and numerical simulations. A rigorous proof was provided 
in O using approximate message passing (AMP) algorithms 
[5 1 and the analysis tools provided by state evolution 0, 0. 
Indeed, |4| proved a more general result. Consider a non- 
random sequence of signals r") G K™ indexed by the problem 
dimensions n, and such that the empirical law of the entries 
of x~( n \ p~~\t) = n^ 1 Yn=i = 0' converges weakly 

to a limit pg with bounded second moment. Then, spatially- 
coupled sensing matrices under AMP reconstruction achieve 
(with high probability) robust recovery of JW, as long as the 
number of measurements is m > d(pg) + o(n). Here d(p%) 
is the (upper) Renyi information dimension of the probability 
distribution p-g. This quantity first appeared in connection with 
compressed sensing in the work of Wu and Verdu 1171 . Taking 
an information-theoretic viewpoint, Wu and Verdu proved that 
the Renyi information dimension is the fundamental limit for 
analog compression. 

D. Contribution 

Using spatial coupling and (approximate) message pass- 
ing, the approaches of Q, J4] allow successful compressed 
sensing recovery from a number of measurements achieving 
the information-theoretic limit. While these can be formally 
interpreted as sampling schemes for the discrete-time sam- 



pling problem introduced in Section I-A they present in fact 
several unrealistic features. In particular, the entries of A are 
independent Gaussian entries with zero mean and suitably 
chosen variances. It is obviously difficult to implement such a 
measurement matrix through a physical sampling mechanism. 

The present paper aims at showing that the spatial coupling 
phenomenon is -in the present context- significantly more 
robust and general than suggested by the constructions of 
Q, flU. Unfortunately, a rigorous analysis of message passing 
algorithms is beyond reach for sensing matrices with depen- 
dent or deterministic entries. We thus introduce an ensemble 
of sensing matrices, and show numerically that, under AMP 
reconstruction, they allow recovery at undersampling rates 
close to the information dimension. Similar simulations were 
already presented by Krzakala et al. Q in the case of matrices 
with independent entries. 

Our matrix ensemble can be thought of as a modification of 
the one in |3 1 for implementing spatial coupling. As mentioned 
above, [3| suggests to sample the signal pointwise (instanta- 
neously) in time. In the Fourier domain (in which the signal 
is sparse) this corresponds to taking measurements that probe 
all frequencies with the same weight. In other words, Af is 
not band-diagonal as required in spatial coupling. Our solution 
is to 'smear out' the samples: instead of measuring x(t*), we 



modulate the signal with a wave of frequency lj*, and integrate 
it over a window of size W~ x around t*. In Fourier space, this 
corresponds to integrating over frequencies within a window 
W around oj*. Each measurement corresponds to a different 
time-frequency pair (t*,u;*). While there are many possible 
implementations of this idea, the Gabor transform offers an 
analytically tractable avenue. Our method can be thought of as 
a subsampling of a discretized Gabor transform of the signal. 

II. Sampling scheme 

A. Constructing the sensing matrix 

The sensing matrix A is drawn from a random ensemble 
denoted by A4(n, nix, L, £, £, S). Here n, m\,L, £ are integers 
and £, 8 € (0, 1). The rows of A are partitioned as follows: 



(5) 



where |R/-| — L, and |Ro| = [nS\. Hence, m — m\L + [nS\. 
Notice that rn/n = (miL + [nd\)/n. Since we will take n 
much larger than wi\L, the undersampling ratio m/n will be 
arbitrary close to 5. Indeed, with an abuse of language, we 
will refer to S as the undersampling ratio. 

We construct the sensing matrix A as follows: 

1) For each k G {1, ••• ,rni}, and each r G R^, a r = b 27T k/n- 

2) The rows {a r } re R Q are defined as 

a r (t) = a(t;t r ,u r ) , (6) 

where {i r } re R are independent and uniformly random in T n , 
and {ui r } re R are equispaced in f2„. Finally, for G T n , and 
G fi n , we define 

1/2 



a(t;t*,w >t ) 



1 



: P e ,t(h,t), C e = { Pe(t*,t) 2 } 



t6T„ 



Here P^.i(t^,,t) is the probability that a random walk on the 
circle with n sites {1, . . . ,n} starting at time at site is 
found at time I at site t. The random walk is lazy, i.e. it stays 
on the same position with probability 1 — £ e (0,1) and moves 
with probability £ choosing either of the adjacent sites with 
equal probability. 

Notice that the probabilities P^^(t^,t) satisfy the recursion 

+ | Pz At* + 1,*), PzAt*,t) = i(t = u), 

where sums on T n are understood to be performed modulo n. 
We can think of as a discretization of a Gaussian kernel. 
Indeed, for 1 <C t -C n 2 we have, by the local central limit 
theorem, 

l r (t-M 2 l 



(7) 



PiA**,t) 



(2tt^) 1 /2 
(4tt^)- 1 /4. 



exp • 



2££ 



(8) 



and hence Ci 

The above completely define the sensing process. For the 
signal reconstruction we will use AMP in the Fourier domain, 
i.e. we will try to reconstruct x from y = Apx + w. It 
is therefore convenient to give explicit expressions for the 
measurement matrix in this domain. 



1) For each k G {1, • • • ,7774}, and each r € Rfc, we have 
a r = efe, where G M" refers to the fc th standard basis 
element, e.g., ei = (1,0,0, ••• ,0). These rows are used to 
sense the extreme of the spectrum frequencies. 

2) For r G Ro, we have a r (uj) = a(uj;t r ,uj r ), where 

o(w;**,w.) = ^ ^= e -<(<--«.)*. (l-£ + £cos(w-^)) £ . 
Ctsjn 

Again, to get some insight, we consider the asymptotic behav- 
ior for 1 <C I <C n 2 . It is easy to check that a is significantly 
different from only if uj — cj* = 0(£ -1 / 2 ) and 

a(uj; t*, cj*) « — — — cxp { — i(uj — — —£t(w — w*) 2 } 

Ulyn y. 2 J 

Hence the measurement depends on the signal Fourier 
transform only within a window of size W = 0(l- 1 ' 2 ), with 
1/n <C W <C 1. As claimed in the introduction, we recognize 
that the rows of A are indeed (discretized) Gabor filters. Also 
it is easy to check that Af is roughly band-diagonal with width 
W. 

B. Algorithm 

We use a generalization of the AMP algorithm for spatially- 
coupled sensing matrices flU to the complex setting. Assume 
that the empirical law of the entries of x^ converges weakly 
to a limit po, with bounded second moment. The algorithm 
proceeds by the following iteration (initialized with x\ — 
E{X} for all i G [n]). For x* G C", r* G C m , 



x t+1 = Vt (x t + (Q t QAyr t ), 

r* =y - Ax 1 + b* r*" 1 + d* r 4 " 1 



(9) 



Here rj t (v) = (rk,i(vi), ■ ■ • ,»?t,n(«n)). where : C C is 
a scalar denoiser. In this paper we assume that the prior p-g 
is known and use the posterior expectation denoiser 

-1/2 , 



%,i(vi) =E{X\X 



'Z = Vi} . 



WaiMt)' 1 , 



where X ~ pg and Z ~ Nc(0, 1) is a standard com- 
plex normal random variable, independent of X. Also, f* 
is the complex conjugate of r* and indicates Hadamard 
(entry wise) product. The matrix Q l G M. mxn , and the vector 
b* G 1"' are given by 



i£[n] 



(10) 

(ii) 

(12) 



where = \A ai \ 2 and dj] t>i = dr} tti {x\ + ((Q* A)V)i), 
dr? M = &jm(2* + ((Q* A)*r*)i). Throughout, ^{v) is 
viewed as a function of v, v, and v, v are taken as independent 
variables in the sense that dv/dv — 0. Then, dr) t i and drj t i 
respectively denote the partial derivative of r]t t i with respect 
to v and v. Also, derivative is understood here on the complex 



domain. (These are the principles of Wirtinger's calculus for 
the complex functions [14|). Finally, the sequence {4>(t)}t>o 
is determined by the following state evolution recursion. 

4> a {t + 1) = a 2 + W ai mmse( £ WhM*)' 1 ) ■ ( 13 > 

ie [n] be [to] 

Here mmse( • ) is defined as follows. If X ~ pj^ and y = 
X + s" 1 /^ for Z ~ N c (0, 1) independent of X, then 



imse(s) = - E{ \X - E[X\Y] | 2 } 



(14) 



III. Numerical simulations 

We consider a Bernoulli-Gaussian distribution p^ = (1 — 
e)<5o + £7C> where 7c is the standard complex gaussian mea- 
sure. We construct a random signal (i(w)) lje sj n by sampling 
i.i.d. coordinates x(lj) We have d{p^) = e ifTTl and 



£ 7i 



+"7 



(w<) 



(15) 



where 70-2(2:) = l/(7r(T 2 ) exp{— zz/a 2 } is the density func- 
tion of the complex normal distribution with mean zero and 
variance a 2 . 

A. Evolution of the algorithm 

Our first set of experiments aims at illustrating the spatial 
coupling phenomenon and checking the predictions of state 
evolution. In these experiments we use e = 0.1, a = 0.001, 
5 = 0.15, n = 5000, I = 800, m x = 20, L = 3, and f = 0.5. 

State evolution yields an iteration-by-iteration prediction 
of the AMP performance in the limit of a large number 
of dimensions. State evolution can be proved rigorously for 
sensing matrices with independent entries (2), (TJ. We also 
refer to |4] for a heuristic derivation which provides the right 
intuition in the case of spatially-coupled matrices. We expect 
however the prediction to be robust and will check it through 
numerical simulations for the current sensing matrix Af. In 
particular, state evolution predicts that 

E{|^(y)-^| 2 }«mmse(^^ a ,^- 1 (t-l)) . (16) 

aGR 

Figure [T] shows the evolution of profile 4>{t) G M m , given by 
the state evolution recursion ( [T3] l. This clearly demonstrates 
the spatial coupling phenomenon. In our sampling scheme, 
additional measurements are associated to the first few coordi- 
nates of x, namely, x\, ■ ■ ■ , x mi . This has negligible effect on 
the undersampling rate ratio because m±L/n —> 0. However, 
the Fourier components Xi,--- ,x mi are oversampled. This 
leads to a correct reconstruction of these entries (up to a mean 
square error of order a 2 ). This is reflected by the fact that 
<fi becomes of order a 2 on the first few entries after a few 
iterations (see t = 5 in the figure). As the iteration proceeds, 
the contribution of these components is correctly subtracted 
from all the measurements, and essentially they are removed 
from the problem. Now, in the resulting problem the first 
few variables are effectively oversampled and the algorithm 
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Fig. 1. Profile 4> a {t) versus a for several iteration numbers. 



LU 

a> -3 



oT -4 
o 

-5 
-6 



MSE 
. MSE_ 



AMP 




g x: T X x I Z 







100 



200 



300 
Iteration 



400 



500 



600 



Fig. 2. Comparison of MSEamp and MSEse across iteration. 

reconstructs their values up to a mean square error of a 1 . 
Correspondingly, the profile <fi falls to a value of order a 2 
in the next few coordinates. As the process is iterated, all 
the variables are progressively reconstructed and the profile 
</> follows a traveling wave with constant velocity. After a 
sufficient number of iterations (t = 400 in the figure), cf> is 
uniformly of order a 2 . 

In order to check the prediction of state evolution, we 
compare the empirical and the predicted mean square errors 

MSEamp - -W^iv) (17) 



MSE 



n 

1 ^ 



SE 



mmse 



aeR 



The values of MSEamp and MSEse versus iteration are 
depicted in Fig. [2] (Values of MSEamp and the bar errors 
correspond to M = 30 Monte Carlo instances). This verifies 
that the state evolution provides an iteration-by iteration pre- 
diction of AMP performance. We observe that MSEamp (and 
MSEse) decreases linearly versus iteration. 

B. Phase diagram 

In this section, we consider the noiseless compressed sens- 
ing setting, and reconstruction through different algorithms 
and sensing matrix ensembles. 

Let A be a sensing matrix-reconstruction algorithm scheme. 
The curve e M> S^(e) describes the sparsity-undersampling 



tradeoff of A if the following happens in the large-system 
limit n,m — > oo, with m/n — S. The scheme A does 
(with high probability) correctly recover the original signal 
provided <5 > <5^(e), while for <5 < <5^(e) the algorithm 
fails with high probability. We will consider three schemes. 
For each of them, we consider a set of sparsity parameters 
£ € {0.1,0.2,0.3,0.4,0.5}, and for each value of e, evaluate 
the empirical phase transition through a logit fit (we omit 
details, but follow the methodology described in 0). 

1 ) Scheme I: We construct the sensing matrix as described 



in Section II-A and for reconstruction, we use the algorithm 



described in Section II-B An illustration of the phase transi- 
tion phenomenon is provided in Fig. [4] This corresponds to 
e — 0.2 and an estimated phase transition location 5 = 0.23. 

As it is shown in Fig. [3] our results are consistent with the 
hypothesis that this scheme achieves successful reconstruction 
at rates close to the information theoretic lower bound 5 > 
d(j)j>) = e. (We indeed expect the gap to decrease further by 
taking larger values of £, n.) 

2) Scheme II: The sensing matrix Af is obtained by choos- 
ing m rows of the Fourier matrix F at random. In time domain, 
this corresponds to sampling at m random time instants as in 
[ 3 1 . Reconstruction is done via AMP algorithm with posterior 
expectation as the denoiser r\. More specifically, through the 
following iterative procedure. 



Vt(x tj r 
y - Ax 1 



i 



,t-i 



(dr)) 



1 



(19) 



Here r) t (v) = (»/t(ui), 
(t>\ Z — Vi} and Z ? 

dr] = 377(5;* + A*r r ) and for a vector u 6 

— l v^™ 



.,r)t(v n )), where r) t {vi) = E{X\X + 
N c (0, 1). Also drj = dr]{^ + A*r*), 

, (u) = 



The sequence 4> t is determined by state evolution 
4>t+i = ^mrr 



K 1 



(20) 



When A has independent entries Ajj ~ N(0, 1/m), state 
evolution ( |20] > predicts the performance of the algorithm ( fT9| ) 
0. Therefore, the algorithm successfully recovers the original 
signal with high probability, provided 



5 > 6(e) = sups • mmse(s) 

s>0 



(21) 



As shown in Fig. [5] the empirical phase transition for scheme 
II is very close to the prediction 5(e). Note that schemes 
I, II both use posterior expectation denoising. However, as 
observed in Q, spatially-coupled matrices in scheme I signif- 
icantly improve the performances. 

3) Scheme III: We use the spatially-coupled sensing matrix 



described in Section II-A and an AMP algorithm with soft- 
thresholding denoiser 



Vst(z;9) = 11- 



(22) 



The algorithm is defined as in Eq. (|9j, except that the soft- 
thresholding denoiser is used in lieu of the posterior expecta- 
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Fig. 3. Phase transition lines for Schemes I, II, III. 

tion. Formally, let T] t (v) = (»/t,i(vi), ■ ■ ■ ,Vt,n(v n )) with 
(vi) = r 1ST {v i ,a*{e)sT 1/2 ), Sl = E ^^(t)" 1 , (23) 



?7/ 



and the sequence of profiles {0(i)}t>o is given by the follow- 
ing recursion. 

Mt + !) = E w ™ E {Mx + s~ 1,2 z- a* s ; 1/2 ) - X\ 2 }. 

i£[n] 

Finally a* = a*(e) is tuned to optimize the phase transition 
boundary. This is in fact a generalization of the complex AMP 
(CAMP) algorithm that was developed in ifTTI for unstructured 
matrices. CAMP strives to solve the standard convex relaxation 



minimize i h 



E 

u6f!„ 



, subject to AfX = y. 



For a given s, we denote by Sg 1 (s) the phase transition location 
for l\ minimization, when sensing matrices with i.i.d. entries 
are used. This coincides with the one of CAMP with optimally 
tuned a = a*(e) flU, HH- 

The empirical phase transition of Scheme III is shown in 
Fig. [3] The results are consistent with the hypothesis that the 
phase boundary coincides with 6g 1 . In other words, spatially- 
coupled sensing matrix does not improve the performances 
under l\ reconstruction (or under AMP with soft-thresholding 
denoiser). This agrees with earlier findings by Krzakala et al. 
for Gaussian matrices (H71. and private communications). This 
can be inferred from the the state evolution map. For AMP 
with posterior expectation denoiser, and for e < S < 5(e), the 
state evolution map has two stable fixed points; one of order 
a 2 , and one much larger. Spatial coupling makes the algorithm 
converge to the 'right' fixed point. However, the state evolution 
map corresponding to the soft-thresholding denoiser is concave 
and has only one stable fixed point, much larger than a 2 . 
Therefore, spatial coupling is not helpful in this setting. 
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